Inverse Sequence Alignment from Partial Examples

Authors

  • Eagu Kim
  • John D. Kececioglu
Abstract

When aligning biological sequences, the choice of parameter values for the alignment scoring function is critical. Small changes in gap penalties, for example, can yield radically different alignments. A rigorous way to compute parameter values that are appropriate for biological sequences is inverse parametric sequence alignment. Given a collection of examples of biologically correct alignments, this is the problem of finding parameter values that make the example alignments score close to optimal. We extend prior work on inverse alignment to partial examples and to an improved model based on minimizing the average error of the examples. Experiments on benchmark biological alignments show we can find parameters that generalize across protein families and that boost the recovery rate for multiple sequence alignment by up to 25%.
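The problem statement above can be made concrete with a small sketch. It is not the method of the paper: it assumes a toy cost model (match cost 0, a fixed mismatch cost, and a single linear gap cost per gapped column) and replaces the paper's optimization with a plain grid search over candidate gap costs. All function names (`alignment_cost`, `optimal_cost`, `average_error`, `fit_gap`) are hypothetical. For each candidate, the error of an example is its cost minus the optimal alignment cost of the same two sequences, and the candidate with the lowest average error over the examples is kept.

```python
def alignment_cost(row_a, row_b, mismatch, gap):
    """Cost of a given alignment: two equal-length rows, '-' marks a gap."""
    cost = 0.0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            cost += gap          # gapped column
        elif x != y:
            cost += mismatch     # substitution column
    return cost


def optimal_cost(s, t, mismatch, gap):
    """Minimum alignment cost of s and t (Needleman-Wunsch, linear gap cost)."""
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * gap]
        for j in range(1, len(t) + 1):
            sub = prev[j - 1] + (0.0 if s[i - 1] == t[j - 1] else mismatch)
            curr.append(min(sub, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def average_error(examples, mismatch, gap):
    """Mean of (example cost - optimal cost); zero means every example is optimal."""
    total = 0.0
    for row_a, row_b in examples:
        s, t = row_a.replace("-", ""), row_b.replace("-", "")
        total += alignment_cost(row_a, row_b, mismatch, gap)
        total -= optimal_cost(s, t, mismatch, gap)
    return total / len(examples)


def fit_gap(examples, candidates, mismatch=1.0):
    """Assumed stand-in for the real optimization: pick the best candidate gap cost."""
    return min(candidates, key=lambda g: average_error(examples, mismatch, g))


# Toy example alignments (illustrative data, not from the paper).
examples = [("AC-T", "ACGT"), ("A", "C")]
best_gap = fit_gap(examples, candidates=[0.25, 1.0])
```

Fixing the mismatch cost rules out the degenerate all-zero parameter choice, under which every alignment would trivially be optimal. With these toy examples, a gap cost of 0.25 makes two gaps cheaper than the substitution in the second example, so that example is no longer optimal and the search prefers the larger gap cost.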


Similar resources

Inverse Parametric Alignment for Accurate Biological Sequence Comparison

For as long as biologists have been computing alignments of sequences, the question of what values to use for scoring substitutions and gaps has persisted. In practice, substitution scores are usually chosen by convention, and gap penalties are often found by trial and error. In contrast, a rigorous way to determine parameter values that are appropriate for aligning biological sequences is by s...


An inverse problem of identifying the coefficient of semilinear parabolic equation

In this paper, a variational iteration method (VIM), which is a well-known method for solving nonlinear equations, has been employed to solve an inverse parabolic partial differential equation. Inverse problems in partial differential equations can be used to model many real problems in engineering and other physical sciences. The VIM is to construct correction functional using general Lagr...


Simple and Fast Inverse Alignment

For as long as biologists have been computing alignments of sequences, the question of what values to use for scoring substitutions and gaps has persisted. While some choices for substitution scores are now common, largely due to convention, there is no standard for choosing gap penalties. An objective way to resolve this question is to learn the appropriate values by solving the Inverse String...


Molecular phylogeny of some avian species using Cytochrome b gene sequence analysis

Reliable identification and differentiation of avian species is a vital step in conservation, taxonomic, forensic, legal and other ornithological interventions. Therefore, this study involved the application of a molecular approach to identify some avian species, i.e. chicken (Gallus gallus), Muscovy duck (Cairina moschata), Japanese quail (Coturnix japonica), laughing dove (Streptopelia senegale...


gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences

Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...




Publication date: 2007